Method and system for automatically gathering information from different types of devices connected in a network when a device fails

ABSTRACT

A method and system for automatically gathering information from different types of devices connected in a network when a device fails includes a network connecting a plurality of different types of devices which communicate with one another across the network. A network controller determines which devices in the network are capable of saving failure information during the time when one of the devices encounters an error. The network controller determines when one of the devices encounters an error. The network controller then gathers the failure information from the devices upon determining that one of the devices has encountered an error.

TECHNICAL FIELD

The present invention generally relates to a method and system for automatically gathering information from different types of devices connected in a network when a device fails such that the information can be used to determine the solution for the failing device.

BACKGROUND ART

A network may connect many different types of devices. Some of the devices may be located remotely from a central point such as a control site of the network. When a problem occurs within one of the devices, analysis of the problem is often inconclusive due to a lack of information. Often, in order to determine the cause of the problem and the solution for the problem, it is necessary to analyze information from the other devices during the time when the problem occurred within the one device.

SUMMARY OF THE INVENTION

Accordingly, it is an object of the present invention to provide a method and system for automatically gathering information from different types of devices connected in a network when a device fails such that the information can be used to determine the solution for the failing device.

It is another object of the present invention to provide a method and system for automatically gathering debug and failure information from different types of devices connected in a network when a device encounters an error in which a network controller becomes aware of the error and then gathers the debug and failure information from the other devices at the time of the error.

It is a further object of the present invention to provide a method and system for automatically gathering debug and failure information from different types of devices connected in a network when a device encounters an error in which a network controller determines which devices in the network are capable of saving debug and failure information and then gathers the debug and failure information from such devices when a device error occurs.

In carrying out the above objects and other objects, the present invention provides a system having a network connecting a plurality of different types of devices which communicate with one another across the network. The system further includes a network controller for determining which devices in the network are capable of saving failure information during the time when one of the devices encounters an error, for determining when one of the devices encounters an error, and for automatically gathering the failure information from the devices upon determining that one of the devices has encountered an error.

In different embodiments of the present invention, the network controller is operable for transmitting a save command to the devices for saving failure information upon determining that one of the devices has encountered an error. The network controller is operable for polling the devices to determine when one of the devices has encountered an error. The network controller automatically gathers failure information from the devices by reading the failure information from the devices.

In another embodiment of the present invention, the devices are operable for transmitting failure information to the network controller. The network controller is operable for transmitting a transmit command to the devices for instructing the devices to transmit the failure information to the network controller for automatically gathering the failure information upon the network controller determining that one of the devices has encountered an error.

In carrying out the above objects and other objects, the present invention further provides a method for a network having a plurality of different types of devices which communicate with one another across the network. The method includes determining which devices in the network are capable of saving failure information during the time when one of the devices encounters an error. The method then determines when one of the devices in the network encounters an error. The failure information from the devices is then automatically gathered at a network controller upon the determination that one of the devices has encountered an error.

The above objects and other objects, features, and advantages of the present invention are readily apparent from the following detailed description of the best mode for carrying out the present invention when taken in connection with the accompanying drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 illustrates a network having a plurality of different types of devices in accordance with the method and system of the present invention; and

FIG. 2 illustrates a flowchart describing operation of the method and system of the present invention.

BEST MODES FOR CARRYING OUT THE INVENTION

Referring now to FIG. 1, a network 10 in accordance with the method and system of the present invention is shown. Network 10 connects a plurality of different devices together for the devices to communicate with one another. As shown in FIG. 1, network 10 connects a device of a first type 12, a device of a second type 14, and a device of a third type 16 together so that each of the devices can communicate with one another to perform their respective functions. Devices 12, 14, and 16 may be generally located remote from one another. Devices 12, 14, and 16 include a wide assortment of device types such as personal computers, printers, storage devices, storage configurations, handheld computer devices, wireless devices, and the like. Network 10 includes local and wide area networks, a fibre fabric network, an Internet protocol network, a storage area network, and the like. Some of devices 12, 14, and 16 are operable for saving information such as processing traces during operation. Such information can provide insight as to why another device on the network has encountered an error and how to solve the problem causing the error. This information is referred to as debug and failure information.

Network 10 also connects a network controller 18 to devices 12, 14, and 16. Network controller 18 generally monitors the conditions of each of devices 12, 14, and 16 to determine when one of the devices encounters an error or fails during operation. When one of devices 12, 14, and 16 encounters an error indicative of a problem occurring within the one device, it is often necessary to analyze debug and failure information at the time of the error from the other devices to determine the cause and solution of the problem with the device encountering the error. To this end, network controller 18 is operable to generate a list of devices 12, 14, and 16 that are capable of storing debug and failure information. Network controller 18 is further operable to communicate with devices 12, 14, and 16 to know when a device has encountered an error.
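The two controller duties described above, listing the devices capable of storing debug and failure information and learning when a device has encountered an error, can be illustrated with a minimal sketch. The sketch assumes that each device exposes a capability flag and an error flag; the names Device, can_save_debug_info, has_error, build_capable_list, and poll_for_error are hypothetical and are not taken from the specification.

```python
from dataclasses import dataclass


@dataclass
class Device:
    # Hypothetical model of a networked device; field names are assumptions.
    name: str
    device_type: str
    can_save_debug_info: bool
    has_error: bool = False


def build_capable_list(devices):
    """Return the names of devices able to save debug and failure information."""
    return [d.name for d in devices if d.can_save_debug_info]


def poll_for_error(devices):
    """Poll each device once; return the first device reporting an error, or None."""
    for d in devices:
        if d.has_error:
            return d
    return None


devices = [
    Device("device-12", "storage device", can_save_debug_info=True),
    Device("device-14", "printer", can_save_debug_info=False),
    Device("device-16", "personal computer", can_save_debug_info=True),
]

capable = build_capable_list(devices)
no_error_yet = poll_for_error(devices)   # nothing has failed so far

devices[1].has_error = True              # simulate device-14 failing
failing = poll_for_error(devices)
```

In this sketch the failing device need not itself be on the list; the list only identifies which devices can contribute debug and failure information.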

Upon determining that one of devices 12, 14, and 16 has encountered an error, network controller 18 commands the devices on the list to each store the debug and failure information at the time of the error. Network controller 18 then automatically gathers the debug and failure information from each of devices 12, 14, and 16 on the list. Network controller 18 gathers the debug and failure information by reading such information from each of listed devices 12, 14, and 16. Alternatively, listed devices 12, 14, and 16 may transmit the debug and failure information to network controller 18. The debug and failure information from the listed devices 12, 14, and 16 may then be analyzed to determine the cause and solution of the error encountered by the failing device. Of course, network controller 18 could also be a device such as devices 12, 14, and 16 in which the network controller performs device and controller functions.
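The save command and the two ways of gathering, reading by the controller or transmission by the devices, might be sketched as follows. This is an illustrative model under stated assumptions only: the class Device, its methods handle_save_command, read_saved, and handle_transmit_command, the error_id value, and the function gather are all hypothetical, and the saved information is represented as a simple copy of an in-memory trace.

```python
class Device:
    """Hypothetical listed device that keeps a running processing trace."""

    def __init__(self, name):
        self.name = name
        self.trace = []      # processing trace kept during operation
        self.saved = None    # snapshot taken when a save command arrives

    def record(self, event):
        self.trace.append(event)

    def handle_save_command(self, error_id):
        # Freeze a copy of the trace as of the time of the error.
        self.saved = {
            "device": self.name,
            "error_id": error_id,
            "trace": list(self.trace),
        }

    def read_saved(self):
        # Pull style: the controller reads the saved information.
        return self.saved

    def handle_transmit_command(self, inbox):
        # Push style: the device transmits the saved information.
        inbox.append(self.saved)


def gather(listed_devices, error_id, mode="read"):
    """Command each listed device to save, then gather by read or transmit."""
    for device in listed_devices:
        device.handle_save_command(error_id)
    if mode == "read":
        return [device.read_saved() for device in listed_devices]
    if mode == "transmit":
        inbox = []
        for device in listed_devices:
            device.handle_transmit_command(inbox)
        return inbox
    raise ValueError("mode must be 'read' or 'transmit'")


d12 = Device("device-12")
d16 = Device("device-16")
d12.record("write issued to device-14")
d16.record("link reset observed")

by_read = gather([d12, d16], error_id="err-1", mode="read")
by_transmit = gather([d12, d16], error_id="err-1", mode="transmit")
```

Either mode yields the same gathered records in this model; the choice only changes which side initiates the transfer.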

Referring now to FIG. 2, with continual reference to FIG. 1, a flowchart 20 illustrating operation of the method and system of the present invention is shown. Flowchart 20 begins with connecting different types of devices 12, 14, and 16 within a network 10 as shown in block 22. A network controller 18 then determines which devices 12, 14, and 16 are capable of saving debug and failure information as shown in block 24. Network controller 18 then generates a list of devices 12, 14, and 16 that are capable of saving the debug and failure information as shown in block 26.

Network controller 18 then monitors devices 12, 14, and 16 to determine if one of the devices encounters an error during operation as shown in block 28. Upon a device encountering an error, network controller 18 notifies all devices 12, 14, and 16 on the list to save the debug and failure information during the time that the one device encountered an error as shown in block 30. Network controller 18 then receives the saved debug and failure information from listed devices 12, 14, and 16 as shown in block 32. Network controller 18 may read the saved information from listed devices 12, 14, and 16 or the listed devices may transmit the information to the network controller. The gathered debug and failure information is then analyzed to determine the cause and solution for the device encountering the error as shown in block 34.
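The ordering of blocks 24 through 32 of flowchart 20 can be traced in a short sketch. It assumes a simulated network in which each device is a dictionary and errors are scripted to appear on a given polling cycle; the function run_controller and the keys capable, errors_at, trace, and saved are hypothetical. Block 22 (connecting the devices) is represented only by the dictionary itself, and block 34 (analysis) is left to the recipient of the returned report.

```python
def run_controller(devices, max_polls):
    """Walk blocks 24 to 32 of the flowchart over a simulated device table.

    devices maps a device name to a dict with the hypothetical keys
    'capable' (bool), 'errors_at' (poll index or None), and 'trace' (list).
    """
    # Blocks 24 and 26: determine the capable devices and list them.
    listed = [name for name, d in devices.items() if d["capable"]]

    # Block 28: monitor the devices by polling.
    for poll in range(max_polls):
        failing = [name for name, d in devices.items() if d["errors_at"] == poll]
        if not failing:
            continue

        # Block 30: notify the listed devices to save their information.
        for name in listed:
            devices[name]["saved"] = list(devices[name]["trace"])

        # Block 32: receive the saved information from the listed devices.
        gathered = {name: devices[name]["saved"] for name in listed}
        return {"failing": failing[0], "poll": poll, "gathered": gathered}

    return None  # no error observed within max_polls


devices = {
    "device-12": {"capable": True, "errors_at": None, "trace": ["io ok"]},
    "device-14": {"capable": False, "errors_at": 2, "trace": []},
    "device-16": {"capable": True, "errors_at": None, "trace": ["timeout"]},
}

report = run_controller(devices, max_polls=5)
quiet = run_controller({"device-12": devices["device-12"]}, max_polls=3)
```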

Thus it is apparent that there has been provided, in accordance with the present invention, a method and system for automatically gathering information from different types of devices connected in a network when a device fails that fully satisfy the objects, aims, and advantages set forth above. While the present invention has been described in conjunction with specific embodiments thereof, it is evident that many alternatives, modifications, and variations will be apparent to those skilled in the art in light of the foregoing description. Accordingly, it is intended to embrace all such alternatives, modifications, and variations as fall within the spirit and broad scope of the appended claims.

What is claimed is:
 1. A system for automatically gathering information from different types of devices connected in a network when a device in the network fails, the system comprising: a network connecting a plurality of different types of devices which communicate with one another across the network, wherein at least one of the devices in the network is capable of saving debug and failure information indicative of operation of the respective at least one device during the time when a device in the network encounters an error; and a network controller for determining the identity of the at least one of the devices in the network which are capable of saving the debug and failure information during the time when one of the devices in the network encounters an error, for determining when and which one of the devices encounters an error, and for gathering the saved debug and failure information from the at least one of the devices upon determining that one of the devices has encountered an error, the saved debug and failure information gathered from the at least one of the devices being used to determine a cause and a solution of the encountered error; wherein, upon determining that one of the devices has encountered an error, the network controller is operable for transmitting a save command to the at least one of the devices in the network for commanding the at least one of the devices to save the debug and failure information where the devices save debug and failure information at the system level regardless of the software running on the devices.
 2. The system of claim 1 wherein: the network controller is operable for polling the devices to determine when one of the devices has encountered an error.
 3. The system of claim 1 wherein: the network controller automatically gathers debug and failure information from the devices by reading the debug and failure information from the devices.
 4. The system of claim 1 wherein: the devices are operable for transmitting the saved debug and failure information to the network controller, wherein the network controller is operable for transmitting a transmit command to the devices for instructing the devices to transmit the saved debug and failure information to the network controller for gathering the saved debug and failure information upon the network controller determining that one of the devices has encountered an error.
 5. The system of claim 1 wherein: the network controller is one of the plurality of devices.
 6. A method for a network having a plurality of different types of devices which communicate with one another across the network, the method comprising: configuring at least one of the devices in the network to be capable of saving debug and failure information indicative of operation of the respective at least one device during the time when a device in the network encounters an error; determining the identity of the at least one of the devices in the network which are capable of saving the debug and failure information during the time when one of the devices in the network encounters an error; determining when and which one of the devices in the network encounters an error; upon the determination that one of the devices has encountered an error, transmitting a save command to the at least one of the devices in the network for commanding the at least one of the devices to save the debug and failure information where the devices save debug and failure information at the system level regardless of the software running on the devices; gathering the saved debug and failure information from the at least one of the devices at a network controller upon the determination that one of the devices has encountered an error; and using the saved debug and failure information gathered from the at least one of the devices to determine a cause and a solution of the encountered error.
 7. The method of claim 6 wherein: determining when one of the devices encounters an error includes polling the devices to determine when one of the devices has encountered an error.
 8. The method of claim 6 wherein: gathering the saved debug and failure information from the devices includes reading the saved debug and failure information from the devices with the network controller.
 9. The method of claim 6 wherein: gathering the saved debug and failure information from the devices includes transmitting a transmit command to the devices for instructing the devices to transmit the saved debug and failure information to the network controller upon the determination that one of the devices has encountered an error.
 10. A method for a network having a plurality of different types of devices which communicate data with one another across the network, the method comprising: configuring at least one of the devices in the network to be capable of saving debug and failure information indicative of operation of the respective at least one device during the time when a device in the network encounters an error; generating a list of the devices in the network which are capable of saving the debug and failure information during the time when one of the devices in the network encounters an error; polling the devices from a network controller to determine if a device has encountered an error; transmitting a save command from the network controller to the devices on the list instructing the listed devices to save debug and failure information upon the network controller determining that one of the devices has encountered an error where the devices save debug and failure information at the system level regardless of the software running on the devices; gathering at the network controller the saved debug and failure information from the listed devices upon the network controller transmitting the save command to the devices on the list; and using the saved debug and failure information gathered from the listed devices to determine a cause and a solution of the encountered error.